After creating an approximate regex object, configure it to perform an exact or approximate search and to broadcast notifications as the search is conducted. Configurable settings include:
- Search patterns. Choose one or more regular expression patterns to match text from the search domain. Use the function IG_REC_approx_regex_pattern_set to specify one or more valid regular expression patterns.
  The approximate regex object supports both the POSIX 1003.2 Extended Regular Expression (ERE) syntax and the Basic Regular Expression (BRE) syntax. The Regular Expressions page describes the supported syntax in detail.
- Case-sensitive matches. Case-sensitive matches treat uppercase and lowercase letters as distinct letters. Case-insensitive matches treat uppercase and lowercase letters as the same letter. Use the function IG_REC_approx_regex_is_case_sensitive_set to toggle between case-sensitive and case-insensitive matches.
- Greedy matches. When two or more matches are made at the same position, the longest match is classified as the greedy match. Use the function IG_REC_approx_regex_is_greedy_set to enable or disable greedy matches.
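The difference between ERE and BRE syntax, and the effect of a case-sensitivity switch, can be illustrated with the standard POSIX <regex.h> interface. This is only a sketch of the concepts: it does not call the IG_REC functions, whose signatures are not shown in this section, and the helper name posix_matches and its parameters are hypothetical.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not part of the IG_REC API).
 * Returns true if `pattern` matches anywhere in `text`.
 *   use_ere        - compile as ERE (REG_EXTENDED); otherwise as BRE
 *   case_sensitive - when false, REG_ICASE is added to the compile flags */
bool posix_matches(const char *pattern, const char *text,
                   bool use_ere, bool case_sensitive)
{
    assert(pattern != NULL && text != NULL);

    int flags = REG_NOSUB;            /* only a yes/no answer is needed */
    if (use_ere)
        flags |= REG_EXTENDED;
    if (!case_sensitive)
        flags |= REG_ICASE;

    regex_t re;
    if (regcomp(&re, pattern, flags) != 0)
        return false;                 /* the pattern did not compile */

    bool found = (regexec(&re, text, 0, NULL, 0) == 0);
    regfree(&re);
    return found;
}
```

Under these assumptions, posix_matches("colou?r", "color", true, true) succeeds because '?' is an operator in ERE, while the same call with use_ere set to false fails because an unescaped '?' is an ordinary character in BRE.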
- Fuzzy matches. Fuzzy matches tolerate incorrect, missing, or extraneous letters in the search domain. A fuzzy search locates substrings of the search domain that match a pattern only after one or more characters are inserted, deleted, or substituted:
  - Insert count. An insert adds a character to produce a match. For example, the pattern "she" will match "shoe" if the letter 'o' is inserted. Likewise, the pattern "lent" will match "learnt" if the letters 'a' and 'r' are inserted. Use the function IG_REC_approx_regex_maximum_insert_count_set to specify the maximum number of inserts that the search will tolerate.
  - Delete count. A delete removes a character to produce a match. For example, the pattern "she" will match "he" if the letter 's' is deleted. Likewise, the pattern "them" will match "he" if the letters 't' and 'm' are deleted. Use the function IG_REC_approx_regex_maximum_delete_count_set to specify the maximum number of deletes that the search will tolerate.
  - Substitute count. A substitute replaces a character to produce a match. For example, the pattern "milk" will match "mi1k" if the character '1' is replaced with 'l'. Likewise, the pattern "jail" will match "pain" if 'p' is replaced with 'j' and 'n' is replaced with 'l'. Use the function IG_REC_approx_regex_maximum_substitute_count_set to specify the maximum number of substitutions that the search will tolerate.
  - Error count. Each insert, delete, or substitute applied to the search domain to induce a match is treated as an error. When a potential match needs more errors than the maximum allows, it is rejected and the search continues. Use the function IG_REC_approx_regex_maximum_error_count_set to specify the maximum number of errors that the search will tolerate.
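The way the four limits interact can be sketched with a small recursive comparison. This is an illustration only, under several assumptions: it compares a literal pattern against one whole candidate string rather than searching a domain with a regular expression, it counts edits the way the examples above describe them (applied to the pattern), and the names fuzzy_limits and fuzzy_equal are hypothetical, not part of the IG_REC API. The recursion is exponential in the worst case and is only suitable for short strings and small limits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical limits structure mirroring the four settings above. */
typedef struct {
    int max_insert;      /* extra characters allowed in the candidate    */
    int max_delete;      /* pattern characters allowed to be missing     */
    int max_substitute;  /* pattern characters allowed to be replaced    */
    int max_error;       /* cap on inserts + deletes + substitutes       */
} fuzzy_limits;

/* Returns true if `pattern` can be turned into `text` without exceeding
 * any of the limits. The struct is passed by value so each branch of the
 * recursion keeps its own remaining budget. */
bool fuzzy_equal(const char *pattern, const char *text, fuzzy_limits lim)
{
    assert(pattern != NULL && text != NULL);

    if (*pattern == '\0' && *text == '\0')
        return true;

    /* Matching characters cost nothing. */
    if (*pattern != '\0' && *pattern == *text &&
        fuzzy_equal(pattern + 1, text + 1, lim))
        return true;

    /* Every remaining option spends one unit of the overall error budget. */
    if (lim.max_error <= 0)
        return false;
    lim.max_error--;

    if (*text != '\0' && lim.max_insert > 0) {          /* insert */
        fuzzy_limits next = lim;
        next.max_insert--;
        if (fuzzy_equal(pattern, text + 1, next))
            return true;
    }
    if (*pattern != '\0' && lim.max_delete > 0) {       /* delete */
        fuzzy_limits next = lim;
        next.max_delete--;
        if (fuzzy_equal(pattern + 1, text, next))
            return true;
    }
    if (*pattern != '\0' && *text != '\0' && *pattern != *text &&
        lim.max_substitute > 0) {                       /* substitute */
        fuzzy_limits next = lim;
        next.max_substitute--;
        if (fuzzy_equal(pattern + 1, text + 1, next))
            return true;
    }
    return false;
}
```

With limits of two substitutions and two errors, "jail" is accepted against "pain". Lowering only the error count to one rejects it, because the overall error cap applies in addition to the per-operation caps.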
- Notifications. Install callbacks to be notified when a word is recognized, when a match is made, and as the search makes forward progress:
  - Recognize word callback. The recognize word callback is invoked for each recognized word during preparation of the search domain. These notifications arrive before the search for pattern matches begins. Use the function IG_REC_approx_regex_recognize_word_cb_set to install a user-defined recognize word callback.
  - Match callback. The match callback is invoked after each successful match is found. This callback gives the application an opportunity to inspect and reject a match result. Rejected matches are excluded from the array of matches returned at the successful conclusion of the search. Use the function IG_REC_approx_regex_match_cb_set to install a user-defined match callback.
  - Progress callback. The progress callback is invoked periodically during the search operation to report the estimated percentage of the search completed. This callback also gives the caller an opportunity to stop the current search prior to completion. Use the function IG_REC_approx_regex_progress_cb_set to install a user-defined progress callback.
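The three notifications follow a common callback pattern, sketched below in self-contained C. Everything in this sketch is hypothetical: the callback typedefs, the search_callbacks structure, the search_words function, and the demo_ helpers are stand-ins invented for illustration, and the real IG_REC callback signatures may differ. The search itself is reduced to exact word comparison so that the order of notifications stays visible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback signatures (the real IG_REC types may differ). */
typedef void (*word_cb)(const char *word, void *user);
typedef bool (*match_cb)(size_t index, void *user);    /* false rejects the match */
typedef bool (*progress_cb)(int percent, void *user);  /* false stops the search  */

typedef struct {
    word_cb on_word;
    match_cb on_match;
    progress_cb on_progress;
    void *user;               /* passed back to every callback */
} search_callbacks;

/* Scans words[0..count) for exact copies of `pattern`. Indices of accepted
 * matches are written to `out` (at most out_cap entries). Returns the number
 * stored, or -1 if the progress callback stopped the search. */
int search_words(const char *const *words, size_t count, const char *pattern,
                 const search_callbacks *cb, size_t *out, size_t out_cap)
{
    assert(words != NULL && pattern != NULL && cb != NULL && out != NULL);
    size_t stored = 0;

    /* Preparation pass: every word is reported before matching starts. */
    for (size_t i = 0; i < count; i++)
        if (cb->on_word)
            cb->on_word(words[i], cb->user);

    /* Search pass: offer each match for inspection, then report progress. */
    for (size_t i = 0; i < count; i++) {
        if (strcmp(words[i], pattern) == 0) {
            bool keep = cb->on_match ? cb->on_match(i, cb->user) : true;
            if (keep && stored < out_cap)
                out[stored++] = i;
        }
        int percent = (int)((i + 1) * 100 / count);
        if (cb->on_progress && !cb->on_progress(percent, cb->user))
            return -1;
    }
    return (int)stored;
}

/* Sample callbacks sharing one state structure. */
typedef struct {
    int words_seen;
    int matches_seen;
    int stop_at_percent;      /* a value above 100 lets the search finish */
} demo_state;

void demo_on_word(const char *word, void *user)
{
    (void)word;
    ((demo_state *)user)->words_seen++;
}

bool demo_on_match(size_t index, void *user)
{
    (void)index;
    demo_state *s = user;
    s->matches_seen++;
    return s->matches_seen == 1;   /* keep the first match, reject the rest */
}

bool demo_on_progress(int percent, void *user)
{
    return percent < ((demo_state *)user)->stop_at_percent;
}

/* Runs the sample search over five words and returns search_words' result. */
int demo_run(int stop_at_percent, demo_state *state)
{
    const char *words[] = {"pay", "the", "invoice", "invoice", "now"};
    size_t hits[5];
    state->words_seen = 0;
    state->matches_seen = 0;
    state->stop_at_percent = stop_at_percent;
    search_callbacks cb = {demo_on_word, demo_on_match, demo_on_progress, state};
    return search_words(words, 5, "invoice", &cb, hits, 5);
}
```

Passing a single user pointer through every callback is one common design for sharing state without globals; whether the actual API offers such a parameter is not stated in this section.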